Goto

Collaborating Authors

 machine learning approach


Machine Learning Approaches to Clinical Risk Prediction: Multi-Scale Temporal Alignment in Electronic Health Records

Chang, Wei-Chen, Dai, Lu, Xu, Ting

arXiv.org Artificial Intelligence

This study proposes a risk prediction method based on a Multi-Scale Temporal Alignment Network (MSTAN) to address the challenges of temporal irregularity, sampling interval differences, and multi-scale dynamic dependencies in Electronic Health Records (EHR). The method focuses on temporal feature modeling by introducing a learnable temporal alignment mechanism and a multi-scale convolutional feature extraction structure to jointly model long-term trends and short-term fluctuations in EHR sequences. At the input level, the model maps multi-source clinical features into a unified high-dimensional semantic space and employs temporal embedding and alignment modules to dynamically weight irregularly sampled data, reducing the impact of temporal distribution differences on model performance. The multi-scale feature extraction module then captures key patterns across different temporal granularities through multi-layer convolution and hierarchical fusion, achieving a fine-grained representation of patient states. Finally, an attention-based aggregation mechanism integrates global temporal dependencies to generate individual-level risk representations for disease risk prediction and health status assessment. Experiments conducted on publicly available EHR datasets show that the proposed model outperforms mainstream baselines in accuracy, recall, precision, and F1-Score, demonstrating the effectiveness and robustness of multi-scale temporal alignment in complex medical time-series analysis. This study provides a new solution for intelligent representation of high-dimensional asynchronous medical sequences and offers important technical support for EHR-driven clinical risk prediction.


Macroscopic Emission Modeling of Urban Traffic Using Probe Vehicle Data: A Machine Learning Approach

Adlouni, Mohammed Ali El, Jin, Ling, Xu, Xiaodan, Spurlock, C. Anna, Lazar, Alina, Sadabadi, Kaveh Farokhi, Amirgholy, Mahyar, Asudegi, Mona

arXiv.org Artificial Intelligence

Urban congestions cause inefficient movement of vehicles and exacerbate greenhouse gas emissions and urban air pollution. Macroscopic emission fundamental diagram (eMFD)captures an orderly relationship among emission and aggregated traffic variables at the network level, allowing for real-time monitoring of region-wide emissions and optimal allocation of travel demand to existing networks, reducing urban congestion and associated emissions. However, empirically derived eMFD models are sparse due to historical data limitation. Leveraging a large-scale and granular traffic and emission data derived from probe vehicles, this study is the first to apply machine learning methods to predict the network wide emission rate to traffic relationship in U.S. urban areas at a large scale. The analysis framework and insights developed in this work generate data-driven eMFDs and a deeper understanding of their location dependence on network, infrastructure, land use, and vehicle characteristics, enabling transportation authorities to measure carbon emissions from urban transport of given travel demand and optimize location specific traffic management and planning decisions to mitigate network-wide emissions.


Predicting Antibiotic Resistance Patterns Using Sentence-BERT: A Machine Learning Approach

Alwakeel, Mahmoud, Yarrington, Michael E., Wrenn, Rebekah H., Fang, Ethan, Pei, Jian, Chowdhury, Anand, Wong, An-Kwok Ian

arXiv.org Artificial Intelligence

Abstract: Antibiotic resistance poses a significant threat in in - patient settings with high mortality. Using MIMIC - III data, we generated Sentence - BERT embeddings from clinical notes and applied Neural Networks and XGBoost to predict antibiotic susceptibility. XGBoost achieved an average F1 score of 0.86, while Neural Networks scored 0.84. This study is among the first to use document embeddings for predicting antibiotic resistance, offering a novel pathway for improving antimicrobial stewardship. Introduction: Sepsis and septic shock are life threatening conditions, with mortality rates as high as 50 - 60%. [1] Delays in appropriate antibiotic administration lead to an 8% decrease in survival for every hour of delay, underscoring the need for prompt and precise treatment.


Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music

Kim, Alexander, Botha, Charlotte

arXiv.org Artificial Intelligence

For singers of all experience levels, one of the most fun and daunting challenges in learning, technical repertoire is navigating placement and vocal register in and around the passagio (passage between chest voice and head voice registers). Contemporary Pop and Musical Theater solos increasingly demand strong command through and above the first passagio, and the use of various timbre and textures to achieve a desired quality. Thus, it can be difficult to identify what vocal register within the vocal range a singer is using even for advanced vocalists. This paper presents two methods for classifying vocal registers in an audio signal of male pop music through the end-to-end analysis of textural features of mel-spectrogram images. Additionally, we will discuss the practical integration of these models for vocal analysis tools, and introduce a concurrently developed software called AVRA which stands for Automatic Vocal Register Analysis. Our proposed methods achieved consistent classification of vocal register through both Support Vector Machine (SVM) and Convolutional Neural Network (CNN) models, which shows promise for robust classification possibilities across a greater range of voice types and genre.


A Machine Learning Approach to Predict Biological Age and its Longitudinal Drivers

Dunbayeva, Nazira, Li, Yulong, Xie, Yutong, Razzak, Imran

arXiv.org Artificial Intelligence

Predicting an individual's aging trajectory is a central challenge in preventative medicine and bioinformatics. While machine learning models can predict chronological age from biomarkers, they often fail to capture the dynamic, longitudinal nature of the aging process. In this work, we developed and validated a machine learning pipeline to predict age using a longitudinal cohort with data from two distinct time periods (2019-2020 and 2021-2022). We demonstrate that a model using only static, cross-sectional biomarkers has limited predictive power when generalizing to future time points. However, by engineering novel features that explicitly capture the rate of change (slope) of key biomarkers over time, we significantly improved model performance. Our final LightGBM model, trained on the initial wave of data, successfully predicted age in the subsequent wave with high accuracy ($R^2 = 0.515$ for males, $R^2 = 0.498$ for females), significantly outperforming both traditional linear models and other tree-based ensembles. SHAP analysis of our successful model revealed that the engineered slope features were among the most important predictors, highlighting that an individual's health trajectory, not just their static health snapshot, is a key determinant of biological age. Our framework paves the way for clinical tools that dynamically track patient health trajectories, enabling early intervention and personalized prevention strategies for age-related diseases.


A Machine Learning Approach for Honey Adulteration Detection using Mineral Element Profiles

Al-Awadhi, Mokhtar A., Deshmukh, Ratnadeep R.

arXiv.org Artificial Intelligence

This paper aims to develop a Machin e Learning (ML) - based system for detecting honey adulteration utilizing honey mineral element profiles. The proposed system comprises two phases: preprocessing and classification. The preprocessing phase involves the treatment of missing - value attributes a nd normalization. In the classification phase, we use three supervised ML models: logistic regression, d ecision tree, and random forest, to discriminate between authentic and adulterated honey. To evaluate the performance of the ML models, we use a public dataset comprising measurements of mineral element content of authentic honey, sugar syrups, and adulterated honey. Experimental findings show that mineral element content in honey provides robust discriminative information for detecting honey adulteration . Results also dem onstrate that the random forest - based classifier outperforms other classifiers on this dataset, achieving the highest cross - validation accuracy of 98.37%.


A Machine Learning Approach to Generate Residual Stress Distributions using Sparse Characterization Data in Friction-Stir Processed Parts

Shaikh, Shadab Anwar, Balusu, Kranthi, Soulami, Ayoub

arXiv.org Artificial Intelligence

Residual stresses, which remain within a component after processing, can deteriorate performance. Accurately determining their full-field distributions is essential for optimizing the structural integrity and longevity. However, the experimental effort required for full-field characterization is impractical. Given these challenges, this work proposes a machine learning (ML) based Residual Stress Generator (RSG) to infer full-field stresses from limited measurements. An extensive dataset was initially constructed by performing numerous process simulations with a diverse parameter set. A ML model based on U-Net architecture was then trained to learn the underlying structure through systematic hyperparameter tuning. Then, the model's ability to generate simulated stresses was evaluated, and it was ultimately tested on actual characterization data to validate its effectiveness. The model's prediction of simulated stresses shows that it achieved excellent predictive accuracy and exhibited a significant degree of generalization, indicating that it successfully learnt the latent structure of residual stress distribution. The RSG's performance in predicting experimentally characterized data highlights the feasibility of the proposed approach in providing a comprehensive understanding of residual stress distributions from limited measurements, thereby significantly reducing experimental efforts.


Enhanced Drought Analysis in Bangladesh: A Machine Learning Approach for Severity Classification Using Satellite Data

Paul, Tonmoy, Mati, Mrittika Devi, Islam, Md. Mahmudul

arXiv.org Artificial Intelligence

Drought poses a pervasive environmental challenge in Bangladesh, impacting agriculture, socio-economic stability, and food security due to its unique geographic and anthropogenic vulnerabilities. Traditional drought indices, such as the Standardized Precipitation Index (SPI) and Palmer Drought Severity Index (PDSI), often overlook crucial factors like soil moisture and temperature, limiting their resolution. Moreover, current machine learning models applied to drought prediction have been underexplored in the context of Bangladesh, lacking a comprehensive integration of satellite data across multiple districts. To address these gaps, we propose a satellite data-driven machine learning framework to classify drought across 38 districts of Bangladesh. Using unsupervised algorithms like K-means and Bayesian Gaussian Mixture for clustering, followed by classification models such as KNN, Random Forest, Decision Tree, and Naive Bayes, the framework integrates weather data (humidity, soil moisture, temperature) from 2012-2024. This approach successfully classifies drought severity into different levels. However, it shows significant variabilities in drought vulnerabilities across regions which highlights the aptitude of machine learning models in terms of identifying and predicting drought conditions.


Evaluating Machine Learning Approaches for ASCII Art Generation

Coumar, Sai, Kingston, Zachary

arXiv.org Artificial Intelligence

Generating structured ASCII art using computational techniques demands a careful interplay between aesthetic representation and computational precision, requiring models that can effectively translate visual information into symbolic text characters. Although Convolutional Neural Networks (CNNs) have shown promise in this domain, the comparative performance of deep learning architectures and classical machine learning methods remains unexplored. This paper explores the application of contemporary ML and DL methods to generate structured ASCII art, focusing on three key criteria: fidelity, character classification accuracy, and output quality. We investigate deep learning architectures, including Multilayer Perceptrons (MLPs), ResNet, and MobileNetV2, alongside classical approaches such as Random Forests, Support Vector Machines (SVMs) and k-Nearest Neighbors (k-NN), trained on an augmented synthetic dataset of ASCII characters. Our results show that complex neural network architectures often fall short in producing high-quality ASCII art, whereas classical machine learning classifiers, despite their simplicity, achieve performance similar to CNNs. Our findings highlight the strength of classical methods in bridging model simplicity with output quality, offering new insights into ASCII art synthesis and machine learning on image data with low dimensionality.


A Machine Learning Approach for Design of Frequency Selective Surface based Radar Absorbing Material via Image Prediction

Sutrakar, Vijay Kumar, K, Anjana P, Kesharwani, Sajal, Bisariya, Siddharth

arXiv.org Artificial Intelligence

The paper presents an innovative methodology for designing frequency selective surface (FSS) based radar absorbing materials using machine learning (ML) technique. In conventional electromagnetic design, unit cell dimensions of FSS are used as input and absorption coefficient is then predicted for a given design. In this paper, absorption coefficient is considered as input to ML model and image of FSS unit cell is predicted. Later, this image is used for generating the FSS unit cell parameters. Eleven different ML models are studied over a wide frequency band of 1GHz to 30GHz. Out of which six ML models (i.e. (a) Random Forest classification, (b) K- Neighbors Classification, (c) Grid search regression, (d) Random Forest regression, (e) Decision tree classification, and (f) Decision tree regression) show training accuracy more than 90%. The absorption coefficients with varying frequencies of these predicted images are subsequently evaluated using commercial electromagnetic solver. The performance of these ML models is encouraging, and it can be used for accelerating design and optimization of high performance FSS based radar absorbing material for advanced electromagnetic applications in future.